System and method for dynamic binding of a spreadsheet with external parameters

ABSTRACT

In one aspect, the present invention relates to utilizing a spreadsheet by defining a parameter external to the spreadsheet and associating the parameter with the spreadsheet to define a parameterized workbook. In one embodiment, this utilization further comprises storing a location of the spreadsheet and storing the name of the parameter in the same storage module as the location. In another embodiment, it includes defining a result external to the spreadsheet, the result referencing one or more cells within the spreadsheet. in another embodiment, this utilization further comprises receiving a value for the parameter and generating the result based at least in part on the value and the spreadsheet. In another embodiment, it further comprises associating a type with the parameter. The type can define a range of values or attributes associated with the parameter.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a divisional of parent U.S. patent application Ser.No. 10/193,015, filed Jul. 11, 2002, entitled “System and Method forEfficiently and Flexibly Utilizing Spreadsheet Information,”, theentirety of which is incorporated herein by reference.

TECHNICAL FIELD

This invention relates generally to computer-based systems and methodsfor data processing, and more particularly relates to systems andmethods for manipulating data, for example, spreadsheet applicationprograms.

BACKGROUND

Generally, spreadsheet application programs are used to automatenumerical and symbolic calculations for business, financial andscientific uses. Spreadsheet programs are the tools of choice for manybusiness and analysis tasks because they combine a very usable graphicalinterface with a simple formula language that allows non-programmers,within the limits of the simple formula language, to createcomputational models.

Spreadsheet programs visually present numeric and non-numeric data in atwo-dimensional grid for easy assimilation by the reader. Each elementof the two-dimensional grid is referred to as a cell. A cell can containeither a data value, or it can contain a formula that calculates a newvalue based on the values of other cells. Spreadsheet cells that containformulas are automatically recalculated when there are changes to theother cells that the formula depends upon. This mechanism allows aspreadsheet user to perform what-if scenarios only by modifying cellvalues and viewing or saving the effects of the changes.

Individual two-dimensional spreadsheets can be organized into a largerentity known as a notebook or workbook. The terms spreadsheet orworksheet will be used interchangeably herein, as will the termsnotebook or workbook.

When worksheets are grouped together to form a workbook, the workbookand all of its worksheets are stored together as a single file (i.e. theworkbook becomes the unit of storage and transfer when moving databetween the program 's memory space and disk storage). Formulas storedin worksheet cells can reference other cells that are in worksheets inthe same workbook and/or cells that are in worksheets in a differentworkbook. When formulas refer to a different workbook, however, knownprior art spreadsheet programs cannot automatically recalculate formulavalues unless that other workbook has also been loaded into theprogram's memory space. These prior art spreadsheet programs do notallow multiple copies of the same workbook to be loaded into memorysimultaneously and they do not allow multiple what-if scenarios to becalculated simultaneously.

In existing spreadsheet programs, data values can be stored in cellseither by user input (directly or through a user-input formula) or bythe user associating the cells with an external data source, such as aquery to a database. When cells are associated with an external datasource, the user of the program can control how often the external datasource should be checked for changes. However, no mechanism exists toautomatically detect when data referenced from the data source haschanged and to recalculate the spreadsheet if, and only if, such achange has occurred.

Spreadsheets in prior art spreadsheet programs perform at most a singlecalculation with fixed inputs to arrive at a fixed set of answers.Changes in external data sources can affect the results of computationsperformed by a spreadsheet, but the computation itself is fixed. Priorart spreadsheet programs make the user vary cell values if the userwants to perform different calculations, including what-if scenarios.

Some prior art spreadsheet programs support various facilities forprogrammatic control over the spreadsheets to automate spreadsheettasks. Some of these programs implement a macro recording and playbackfacility, which allows repetitive tasks to be automated by recording theuser's actions and later replaying them. Some programs publishapplication programming interfaces (APIs) that allow computer programsto be written to manipulate the spreadsheets or to extend the userinterface of the program. Because programming skills are required to usethese application programming interfaces, they are not used by mostspreadsheet users.

Some prior art spreadsheet programs cache internally the results ofintermediate calculations (cell values) to improve the speed ofrecalculation. Prior art spreadsheet programs support re-execution ofexternal data queries either when manually requested, on a periodicbasis while the workbook is loaded into the program's memory, orwhenever the workbook is opened. None of these methods are optimal, andnone of these methods are effective when the spreadsheet program is notrunning.

SUMMARY

In one aspect, the present invention relates to a method for utilizing aspreadsheet. The method comprises defining a parameter external to thespreadsheet and associating the parameter with the spreadsheet to definea parameterized workbook. In one embodiment, the method furthercomprises storing a location of the spreadsheet and storing the name ofthe parameter in the same storage module as the location. In anotherembodiment, the method further comprises defining a result external tothe spreadsheet, the result referencing one or more cells within thespreadsheet. In another embodiment, the method further comprisesreceiving a value for the parameter and generating the result based atleast in part on the value and the spreadsheet. In another embodiment,the method further comprises defining a format for the result. Inanother embodiment, the method further comprises associating a type withthe parameter, the type defining a range of values.

In another embodiment, the method further comprises associating a typewith the parameter, the type defining attributes associated with theparameter. hi another embodiment, the method further comprises defininga formula within the spreadsheet using the parameter. In anotherembodiment, the method further comprises binding a value of theparameter to an instance of the associated spreadsheet. In anotherembodiment, the parameterized workbook is a first parameterizedworkbook, and the method further comprises defining a secondparameterized workbook and referencing a result from the secondparameterized workbook in a formula in the first parameterized workbook.In another embodiment, the method further comprises storing, separatefrom the spreadsheet and after the spreadsheet closes, an intermediatevalue used in a calculation for the spreadsheet and associating theintermediate value with the spreadsheet and a value of the parameterused to calculate the intermediate value.

In yet another embodiment, the method further comprises automaticallycalculating a result associated with the spreadsheet without opening thespreadsheet by using the stored intermediate value. In anotherembodiment, the method further comprises distributing calculations ofthe spreadsheet among a plurality of computing devices. In anotherembodiment, the method further comprises distributing calculations ofthe spreadsheet among a plurality of computing devices based at least inpart on one or more formulas in the spreadsheet. In another embodiment,the method further comprises distributing the calculations of thespreadsheet among a plurality of computing devices based at least inpart on one or more stored associations in a database associated withthe parameterized workbook.

In another aspect, the invention relates to a system for utilizing aspreadsheet. The system comprises a spreadsheet and a list ofparameters. The spreadsheet comprises one or more associated formulas.The list of parameters is associated with the spreadsheet. The list ofparameters is external to the spreadsheet and one or more parameterswithin the list are referenced in the one or more associated formulas ofthe spreadsheet. In one embodiment, the system further comprises astorage module to store a location of the spreadsheet and store the nameof each parameter in the associated list of parameters. In anotherembodiment, the system further comprises a list of results associatedwith the spreadsheet. In another embodiment, the system furthercomprises a calculation module to receive a value for a first parameterwithin the list of parameters and generate a first result based at leastin part on the value and the spreadsheet. In another embodiment, thesystem further comprises an instantiator module to bind a value of afirst parameter within the list of parameters to an instance of thespreadsheet. In another embodiment, the spreadsheet is a firstspreadsheet and the system further comprises a second spreadsheet havingone or more formulas referencing a result from the first spreadsheet.

In another aspect, the invention relates to a method for utilizing aplurality of spreadsheets. The method comprises defining a parameter anddefining a set of one or more selection rules to select a firstspreadsheet from a plurality of spreadsheets based at least in part on avalue of the parameter, thereby defining a virtual workbook. In oneembodiment, the method further comprises selecting a first workbook fromthe plurality of workbooks based at least in part on a value for theparameter and the set of one or more rules. In another embodiment, themethod further comprises referencing the virtual workbook in a createdspreadsheet as a substitute for a second spreadsheet in the plurality ofspreadsheets. In another embodiment, the method further comprisesdefining a result external to the virtual workbook, the resultreferencing one or more cells within each spreadsheet in the pluralityof spreadsheets.

In another embodiment, the method further comprises receiving a valuefor the parameter and generating the result based at least in part onthe value and the first spreadsheet. In another embodiment, the methodfurther comprises defining a format for the result. In anotherembodiment, the method further comprises associating a type with theparameter, the type defining a range of values. In another embodiment,the method further comprises associating a type with the parameter, thetype defining attributes associated with the parameter. In anotherembodiment, the method further comprises defining a formula containingthe parameter within a spreadsheet in the plurality of spreadsheets. Inanother embodiment, the method further comprises binding a value of theparameter to an instance of the selected first spreadsheet. In anotherembodiment, the virtual workbook is a first virtual workbook and themethod further comprises defining a second virtual workbook andreferencing a result from the second virtual workbook in a formula inthe first virtual workbook. In another embodiment, the method furthercomprises defining a parameterized workbook and referencing a resultfrom the parameterized workbook in a formula in the virtual workbook. Inanother embodiment, the method further comprises defining aparameterized workbook and referencing a result from the virtualworkbook in a formula in the parameterized workbook.

In another embodiment, the method further comprises storing, separatefrom the first spreadsheet and after the first spreadsheet closes, anintermediate value used in a calculation for the first spreadsheet andassociating the intermediate value with the first spreadsheet and avalue of the parameter used to calculate the intermediate value. Inanother embodiment, the method further comprises automaticallycalculating a result associated with the first spreadsheet withoutopening the first spreadsheet by using the stored intermediate value. Inanother embodiment, the method further comprises distributingcalculations of the first spreadsheet among a plurality of computingdevices. In another embodiment, the method further comprisesdistributing calculations of the first spreadsheet among a plurality ofcomputing devices based at least in part on one or more formulas in thefirst spreadsheet. In another embodiment, the method further comprisesdistributing the calculations of the first spreadsheet among a pluralityof computing devices based at least in part on one or more storedassociations in a database associated with the virtual workbook.

In yet another aspect, the invention relates to a system for utilizing aplurality of spreadsheets. The system comprises a list of parameters anda selection module. The selection module selects a first spreadsheetfrom a plurality of spreadsheets. In one embodiment, the system furthercomprises a list of results associated with each of spreadsheets in theplurality of spreadsheets. In another embodiment, the system furthercomprises a calculation module to receive a value for a first parameterwithin the list of parameters and generate a first result based at leastin part on the value and the first spreadsheet. In another embodiment,the system further comprises an instantiator module to bind a value of afirst parameter within the list of parameters to an instance of thefirst spreadsheet. In another embodiment, the spreadsheet is a firstspreadsheet and the system further comprises a second spreadsheet havingone or more formulas referencing a result from the first spreadsheet. Inanother embodiment, the system further comprises a storage module tostore a location of each of the spreadsheets in the plurality ofspreadsheets and store the name of each parameter in the associated listof parameters.

In another aspect, the invention relates to a method for storinginformation associated with a spreadsheet. The method comprises storinga location of a reference spreadsheet and storing a list of parametersassociated with the reference spreadsheet. In one embodiment, the methodfurther comprises storing, separate from the reference spreadsheet andafter the reference spreadsheet closes, an intermediate value used in acalculation for a reference spreadsheet and associating the intermediatevalue with the reference spreadsheet and a value of a parameter from thelist used to calculate the intermediate value. In another embodiment,the method further comprises automatically calculating many workbookresults by enumerating lists of legal parameter values based on typeinformation associated with each respective parameter. In anotherembodiment, the method further comprises automatically calculating aresult associated with the reference spreadsheet without opening thereference spreadsheet by using the stored intermediate value. In anotherembodiment, the method further comprises storing a list of outputsassociated with the reference spreadsheet.

In yet another embodiment, the method further comprises storing anassociation of the reference spreadsheet to another spreadsheet uponwhich the reference spreadsheet depends. In another embodiment, themethod further comprises monitoring the another spreadsheet to detect achange within an output. In another embodiment, the method furthercomprises automatically performing a calculation using the referencespreadsheet in response to detecting the change within the output of theanother spreadsheet. In another embodiment, the method further comprisesstoring an association of the reference spreadsheet to a data sourceupon which a formula in the spreadsheet depends. In another embodiment,the method further comprises monitoring the data source to detect achange within the data source. In another embodiment, the method furthercomprises automatically performing a calculation using the referencespreadsheet in response to detecting the change within the data source.

In another embodiment, the method further comprises storing version dataassociated with the reference workbook. In another embodiment, thelocation comprises a network address. In another embodiment, the methodfurther comprises distributing calculations among a plurality ofcomputing devices based at least in part on one or more storedassociations. In another embodiment, the method further comprisesdistributing calculations of the reference spreadsheet among a pluralityof computing devices. In another embodiment, the method furthercomprises distributing calculations of the reference spreadsheet among aplurality of computing devices based at least in part on one or moreformulas in the reference spreadsheet. In another embodiment, the methodfurther comprises distributing the calculations of the referencespreadsheet among a plurality of computing devices based at least inpart on one or more stored associations in a database associated withthe reference spreadsheet.

In a further aspect, the invention relates to a method for utilizing aspreadsheet. The method comprises defining a parameter associated withthe spreadsheet and generating a data query based at least in part onthe parameter. In one embodiment, the method further comprisesgenerating an output based at least in part on the results of the dataquery.

In another aspect, the invention relates to a method for utilizing aspreadsheet. The method comprises defining a template within thespreadsheet and generating an output based at least in part on thetemplate. In one embodiment, the method further comprises generating adata query based at least in part on a parameter associated with thespreadsheet. In another embodiment, the one or more cells within thetemplate contain formulas. In another embodiment, the formulas arewritten in spreadsheet formula language. In another embodiment, themethod further comprises replicating one or more cells within thetemplate. In another embodiment, the method further comprises preservingrelative cell references. In another embodiment, the method furthercomprises replicating formatting of the template cells.

In yet another embodiment, the method further comprises associatingvalues from a data query with the one or more replicated cells by usingcolumn names in formulas within the one or more replicated cells andperforming calculations using the associated values. In anotherembodiment, the method further comprises performing special processingon the output when the data query returns no associated values. Inanother embodiment, the method further comprises automatically sortingthe output based at least in part on the associated values of the one ormore cells in the output. In another embodiment, the method furthercomprises associating a formula language name with the output. Inanother embodiment, the method further comprises automatically updatingthe output when a change is detected. In another embodiment, the changecomprises a change to i) template cell formulas, ii) template cellformatting, iii) template cell values, or iv) data query parameters.

In another aspect, the invention relates to another method for utilizinga spreadsheet. The method comprises defining an output range within thespreadsheet, rendering the output range and allowing a user to modifythe rendered output range. In one embodiment, the method furthercomprises rendering the output range using HTML. In another embodiment,the method further comprises allowing the user to sort columns withinthe output range using a user input. In another embodiment, the methodfurther comprises allowing a user to interactively expand and collapse ahierarchy using a user input.

In yet another aspect, the invention relates to an article ofmanufacture comprising one or more computer program portions embodiedtherein to cause a processor to perform each of the methods above.

Among other advantages, the invention described above allowsnon-programmers greater flexibility, including allowing the applicationof spreadsheets to certain kinds of business problems that are nottractable with conventional spreadsheet programs. The present inventionsderive, in part, from the observation that currently availablespreadsheet systems do not meet the needs of users who would like to usespreadsheet based systems to solve these kinds of problems.

The spreadsheet-based data processing systems efficiently perform largebusiness and financial computations based on a network of inter-relatedspreadsheets. The systems include spreadsheet modeling mechanisms thatwork in concert to allow non-programmers to model classes of problemsthat are intractable using prior art spreadsheet programs. One exampleof a business problem is the use of spreadsheets to manage salescommission programs. In a typical situation, each salesperson'scommission plan may be based on several variables, such as sales quotagoals or particular commission rates to be paid on certain sales. Eachsalesperson's plan may also vary based on their seniority or the kind ofterritory they cover. In current practice, compensation specialistsoften model the commission plans using spreadsheets. Ideally, eachsalesperson will have a separate spreadsheet customized to theirsituation. Managers will also have their own tailored commission plans,modeled as a spreadsheet, and these will often depend on the results ofthe people reporting to them. The result is a computational model thatconsists of a large web of interdependent spreadsheets, which can numberin the thousands for a large sales organization.

Existing spreadsheet programs lack effective end-user automationfunctionality to deal with models of this scale and complexity. Thesemodels may not be able to fit into the memory space of the program, andso must be broken up into multiple workbooks. Conventional programsallow for only manual management of the dependencies between theresulting workbooks, leading to mistakes. The complexity of the modelsmakes them difficult to change without introducing errors in thereferences between workbooks. The methods and systems described aboveinclude the concept of a parameterized spreadsheet, which greatlyfacilitates the reuse of spreadsheets as building blocks in largecomputations, and automated parameter-sensitive dependency tracking,which reduces errors caused by unintended sharing of workbooks or thefailure to load a dependent spreadsheet into memory or to recalculate itwhen necessary, for example because the data in a referenced data sourcehas changed.

Moreover, existing spreadsheet calculation algorithms do not workefficiently with large models that may contain hundreds or thousands ofworkbooks. The methods and systems described above allow large andcomplex spreadsheet models to be efficiently recalculated andmaintained. The ability to quickly perform recalculations in turn makesit practical to perform large what-if scenarios and to deliver on-demandcalculations.

The methods and systems described above also allow these largespreadsheet calculations to be kept up to date and to be reported uponwithout user intervention. The results of the computations can be madeavailable on demand, even in the face of continual changes to theunderlying data and the evolution of the spreadsheets themselves. Theability to access these spreadsheet models on demand allows self-serviceapplications to be created for information consumers. For example, usinga self-service web site, a salesperson can access their currentcommission calculations or a chief financial officer can view anddownload an up to date projection of the commission expenses for thecurrent quarter.

The specification frequently refers to a sales commission model toprovide examples of the inventive techniques described herein. It isunderstood, however, that the present invention is not limited in scopeto the provided example of sales commission calculations. The presentinvention is applicable to many other application domains including, butnot limited to, financial services, logistics and process modeling. Inthe domain of financial services, the present invention may be appliedto build, manage and calculate models to determine portfolio valuationsor to guide securities or commodity trading based on spreadsheet modelsdeveloped by the user. In the logistics and process modeling domains,the present invention may be used to apply user developed spreadsheetmodels to optimize the efficiency of a manufacturing facility or atransportation network.

In some embodiments, the invention relates to systems and methods forspreadsheet data processing that applies parameter controlledspreadsheet workbooks to specific data. When a workbook is applied todata sources and parameters for the workbook are selected (ifnecessary), the system and method performs the workbook calculationsupon the data sources and subsidiary workbooks (if any) to producecalculated results. Such results, in one embodiment, can be manifestedin any of several different formats.

Each workbook may make reference to subsidiary workbooks, which may beapplied when the parent workbook is applied. Each applied workbook orsubsidiary workbook may be controlled by supplied parameters. Theparameters may control the selection and application of each subsidiaryworkbook. Applied workbooks may reference subsidiary workbooks multipletimes with different parameter values and may make recursive selfreferences with different parameters. A family of workbooks with similarparameters may be grouped to form a virtual workbook that uses a set ofsupplied rules to select one member of the family when the virtualworkbook is applied.

Workbook parameters may be typed, and the type may limit the suppliedvalues. Workbook and data dependencies are tracked in order tofacilitate the maintenance of workbooks and data. The systems or methodsmay provide caching of intermediate computations across workbooks anddata sources and may distribute computations across multiple computers.Specific sets of cached results may be constantly maintained and madeavailable as a multidimensional data source (e.g. as time series data).

The details of one or more embodiments of the invention are set forth inthe accompanying drawings and the description below. Other features,objects, and advantages of the invention will be apparent from thedescription and drawings, and from the claims.

DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram of an illustrative embodiment of a complexmodel comprising parameterized workbooks in accordance with theinvention;

FIG. 2 is a block diagram of an illustrative embodiment of a virtualworkbook in accordance with the invention;

FIG. 3 is a block diagram of an illustrative embodiment of a system togenerate and use parameterized workbooks in accordance with theinvention;

FIG. 4 is a screen shot of an illustrative embodiment of a parameterizedworkbook in accordance with the invention; and

FIG. 5 is screen shot of an illustrative embodiment of a reportgenerated using a virtual workbook in accordance with the invention.

Like reference symbols in the various drawings indicate like elements.

DETAILED DESCRIPTION

FIG. 1 is a block diagram of an illustrative embodiment of a complexmodel 100 comprising a first parameterized workbook 105 a, andoptionally, a second parameterized workbook 105 b and an nthparameterized workbook 105 n. The parameterized workbooks 105 a, 105 b .. . 105 n are referred to generally as 105. The first parameterizedworkbook 105 a includes a parameter module 110, a workbook module 115and a results module 120. The parameter module 110 can include a list ofone or more parameters (e.g., external inputs). The results module 120can include a list of one or more results (e.g., outputs). Results 120can include, for example, a single value retrieved from a workbook cell,an array of values taken from a region of workbook cells, an HTMLrendering of a region (e.g. the print area) of a sheet from a workbookand/or an image or an alternative representation describing a chart. Thearrow 125 represents the association of a particular parameter 110 witha particular workbook 115. Similarly, the arrow 130 represents theassociation of a particular result with a particular workbook 115. Theparameterized workbook 105 is a basic building block and as illustrated,can call itself (represented by path 135) and/or a second parameterizedworkbook 105 b (represented by path 140) to model complex calculations.The modules throughout the specification can be implemented as asoftware program (e.g., a set and/or a sub-set of processor instructionsand the like) and/or a hardware device (e.g., ASIC, FPGA, processor,memory, storage device and the like).

A creator (e.g., user, administrator, system and/or automated process)associates the parameter module 110 and the results module 120 with theworkbook module 115. This association can facilitate the reuse ofworkbook calculations in different contexts. A system, for example asdescribed in FIG. 3 below, stores the lists of parameters 110 andresults 120, and the types of these parameters 110 and results 120, in astorage module 315 (FIG. 3), for example, a database. By associatingparameters 110 and results 120 with each workbook 115, a single workbook115 can perform a potentially unlimited number of calculations. Thisallows a parameterized workbook 105 to become a reusable computationalbuilding block in constructing a larger model 100.

The creator can also associate type information with each workbookparameter 110 and result 120. The type information constrains the set oflegal data values the system can use for a parameter 110 wheninstantiating a workbook and the set of legal data values that thesystem can return as an output result 120 from a workbook 105instantiation.

The parameter 110 and result 120 values of a parameterized workbook 105can be scalar types, such as strings and numbers, and/or elements ofstructured data types such as records or objects. Parameter 110 andresult 120 values can also be arrays of values derived from ranges ofcells, which need not be contiguous. In one embodiment, the typesassociated with the parameters 110 and results 120 are described by anobject-oriented type system that can represent objects with data fields,methods, inheritance and information hiding capabilities.

For example, a parameter can be a person type parameter. A person typeparameter can include attributes such as personal information, title,supervisor, and compensation information. The personal information caninclude the attributes First Name, Last Name, Social Security Number,Login ID, Password, Employee No., Work phone, Home phone and Date ofhire. The compensation information can include the attributes BaseSalary and Effective Date. For ease of use, the system can employ a userinterface for defining the attributes for specific employees of anorganization. For example, a user can input a value for the attributeTitle using a pull-down menu including the values, for example, SalesRep, District Manager, Regional Manager, Vice President-Sales, StrategicAccount Manager, Director-Strategic Accounts, CEO, and CommissionAdministrator. The Supervisor attribute can be a pointer to existinginstances of person type parameters.

The type system of the parameterized workbook 105 also captures datasource mapping information about how object and record types relate todata sources. Object or record types can optionally be mapped to thedatabase schema of a data source, which can be either an internal datasource (e.g., part of the system embodying the present invention) or anexternal data source. The data source mapping information is sufficientto allow the system to perform queries against the data source and toenumerate and access all instances of the object or record typeavailable from the data source. The type system can combine the datasource mapping information with the object or record type information toexpress arbitrary filters or restrictions on the set of object or recordtypes. The type system provides the ability to express data dependentconstraints on workbook parameters 110 and return values 120. In oneembodiment, the system includes built-in primitives for describingconstraints based on roles or hierarchies (e.g. Person where PersonHasRole SalesManager and Person.Territory Isln Europe).

The type information associated with each parameterized workbook 105 canbe created from a combination of user input and from a programmaticexamination of the internal structure of the computations. The systemcan employ type inference methods for determining type information basedon an analysis of the static structure of a computation (such as thestatic structure of a computer program) as is known in the art.

The system uses the type information to constrain the legalinstantiations of a parameterized workbook 105. For example, aparticular workbook 105 a might have two parameters 110. The firstparameter 110 is typed as the person object, further constrained to havethe job title of telesales representative. The second parameter 110 istyped as a time period object, further constrained to be a calendarmonth. The workbook 105 a might define several outputs 120. One output120 might be a bar chart indicating sales performance of the salesperson for a particular month (the output type might be specified as aGIF image). Another output might be a list of daily sales performancemetrics (such as the number of calls or average call duration or productrevenue generated). The type of this second output might be an array ofnumeric values (one for each day of the month).

The system can use parameter type information to avoid a common sourceof user errors when creating, modifying or accessing computations. Forexample, based on the available type information for the workbookdescribed above, the system can automatically generate a user interfaceto prompt the user for a telesales representative (chosen from a list ofall such people known to the system) and a month (based on the monthsfor which the system has data). The system can also use spreadsheetparameter information 110 for other purposes, including for example,data dependency tracking and pre-computation and caching of result sets120, as described below.

In one embodiment, the system receives an XML file to create aparameterized workbook 105. For example, the XML file can be defined asfollows:

<NVWorkbook name=″Rep_ProductLineCommissionCalculations”file=″/workbooks/Rep_ProductLineCommissionCalculations.xls”>  <paramdeftype=″Person″ name=″person″/>  <paramdef type=″Year″ name=″period″/> <attr name=″CreatedBy″ value=″Chris Thompson″/>  <attrname=″Description″ value=″Workbook calculates  earned commissions forall 12 months of a Year  for two product lines.″ />  <outputname=″YTDCommission″ cell=″Commission_YTD″> </NVWorkbook>

In this example, the file names the parameterized workbook 105“Rep_ProductLineCommissionCalculations.” In addition to the name, theparameterized workbook 105 has two additional attributes, “created by”and “description.” In one embodiment, the system can provide a userinterface that lists the respective values for these three attributesfor all of the parameterized workbooks 105 in the system. The file alsodefines the workbook module 115 used for this parameterized workbook105. This workbook is“/workbooks/Rep_ProductLineCommissionCalculations.xls.”

This example file names the parameters 110 associated with theparameterized workbook 105 Rep ProductLineCommissionCalculations“person” and “period” and defines them as “Person” and “Year” types,respectively. Person and Year have definitions in a data dictionary thatdescribes the object model of the application. In this example, theassociation 125 of the parameter 110 with the workbook 115 is made inthe workbook itself in addition to being described in the file. In otherwords, the spreadsheet Rep_ProductLineCommissionCalculations.xlscontains formulas that use the parameters person and period in them.When the system instantiates the parameterized workbook 105, the systembinds specific values for these parameters to the instantiation of theRep_ProductLineCommissionCalculati-ons.xls spreadsheet.

This example file names the output 120 “YTDConunission.” Otherparameterized workbooks 105 can use this name in their formulas and thesystem can find this value and pass this value to those other workbooks105. As described below, the system can store this value so that theother workbooks can receive this value without having to re-instantiatethe Rep_ProductLineCommissionCalculations workbook 105. In this example,the file defines the association 135 between the output 120 and theworkbook module 115. The file associates this output 120 with the cellnamed “Commission_YTD.”

An output such as “YTDCommission” also defines a family of relatedoutput values based on all possible combinations of input parameters tothe parameterized workbook 105. Sets or subsets of these families ofoutput values can be used as multi-dimensional data cubes for analyzingtrends and data relationships as is known in the art. An example of asubset of values derived from this example workbook is the list ofYTDCommission values for all sales managers in Europe for the year 2000.

In one embodiment, the creator associates the parameter module 110 andthe results module 120 with a workbook module 115 that includes aworkbook (which can contain multiple worksheets). A workbook is usedbecause the workbook is a convenient unit of spreadsheet computation,easily identifiable and manageable by the user. The system, however,does not depend on certain properties of workbooks manifested in currentspreadsheet programs, such as the workbook being the unit of filestorage and the unit of transfer when moving a spreadsheet computationinto main memory for processing. In other embodiments, the workbookmodule 115 can include any logical unit of spreadsheet computation,comprising one or more spreadsheets and associated formulas, as the unitof parameterization. Therefore, although workbook is used for clarity,the term workbook can be substituted with this logical unit ofspreadsheet computation throughout the specification.

With parameterized workbooks 105, a user does not have to vary cellvalues in order to perform different calculations, including what-ifscenarios. Using parameterized workbooks 105, a system can perform thesevariable calculations non-interactively (e.g., without intervention byan user). As described in more detail below, a system can perform avarying parameterized workbook 105 calculation on a client computer oron one or more server computers. An end user, a formula appearing in aworkbook cell, and/or a program using an API can initiate aparameterized workbook 105 calculation either interactively ornon-interactively. Other computer systems communicating over a networkcan also initiate a parameterized workbook 105 calculation. These othersystems may use established communication protocols such as, forexample, CORBA (Common Object Request Broker, from OMG (ObjectManagement Group)), RMI (Java Remote Method Invocation) or SOAP (SimpleObject Access Protocol, from Microsoft). A parameterized workbook 105calculation may be initiated in response to a user request from a webbrowser or another user interface device include cell phones, personaldigital assistants, etc.

In one embodiment, the creator of the parameterized spreadsheet 105limits the variable calculations to a subset. Parameterized spreadsheet105 calculations may limit the variations to a list of specificparameters 110 and to a potentially limited set of possible values foreach parameter 110.

The output 120 of a parameterized spreadsheet 105 calculation can takemany forms. The software performing the parameterized spreadsheet 105calculation can format a worksheet or a region on a worksheet fordisplay on some output device. The display format can vary depending onthe output device, and can include standardized output formats such asHTML (Hypertext Markup Language, the main document format recognized byweb browsers), WML (Wireless Markup Language, similar to HTML buttargeted at wireless devices such as cell phones), or XML (extensibleMarkup Language, used for business to business (or system to system)communication), in addition to device specific formats. Alternatively,the output 120 of a parameterized spreadsheet 105 calculation can be aset of data values, suitable for use in further data processing. Variousoutput formats can be used for representing these sets of data values120, including document formats such as HTML or XML or formats based ondata communication protocols such as CORSA or RMI. Parameterizedspreadsheets 105 allow greater control over the course of thecomputation, including the ability to select different sets of externaldata upon which to operate.

Parameterized spreadsheets 105 allow computations to be performed andallow those computations to be influenced by external data sourcesthrough parameters, data ranges, and other mechanisms as describedbelow. They provide increased flexibility to the user, allowingcomputations to use varying inputs and allowing the output desired to bespecified by the user without having to change the underlyingspreadsheet 115. Introducing parameterized spreadsheets 105 as a formalmodeling mechanism provides an interface suitable for non-programmers tocreate reusable spreadsheet-based computational building blocks and itprovides programmers with increased flexibility and power.

When a parameterized workbook 105 calculation is performed in thecontext of a specific set of parameters 110, this can be referred to asa workbook instantiation. As illustrated with path 135, a model 100 cansimultaneously instantiate a parameterized workbook 105 a multiple timeswith different values for the parameters 110. This allows aparameterized workbook 105 calculation to depend on one or morecalculations from subsidiary parameterized workbooks (e.g., 105 a, 105 b. . . 105 n). Any subset of the subsidiary instantiations, and even thereferencing workbook instantiation itself, can be instantiations of thesame parameterized workbook 105.

The combination of parameterized spreadsheets 105 and simultaneousinstantiation allow many business problems to be modeled naturally bythe end user. For example, if a sales manager is paid based on theperformance of the salespeople who report to him, and if eachsalesperson's performance is determined by a parameterized spreadsheet105 calculation, then the manager's performance can be modeled 100 by aparameterized spreadsheet 105 a that depends upon values calculated bysubsidiary workbook instantiations (e.g., 105 b . . . 105 n). The numberof dependent instantiations and the parameters 110 used in eachinstantiation are controlled by the formulas in the referencing workbook115, perhaps for example, based on a database representation of theorganizational structure of the company. As described in more detailbelow, FIG. 4 illustrates an exemplary embodiment of a screenshot of aparameterized workbook 105.

In addition to the parameterized workbook 105, FIG. 2 illustrates anexemplary embodiment 200 of another building block for use in complexmodeling, the virtual workbook 205. The virtual workbook 205 includes aparameter module 110 ′, a results module 120 ′, a virtual workbookmodule 210 and a selection module 215. In one embodiment, the selectionmodule 215 can be included in and/or as part of the virtual workbookmodule 210. An instantiation of the virtual workbook 205 selects aconcrete workbook (e.g., 105 a′. . . 105 n′) to instantiate based on thevalues of the supplied parameters 110′ and a set of rules 215 associatedwith the virtual workbook 205. The virtual workbook 205 mechanisms aresimilar to programming language polymorphism mechanisms, such as virtualmethods or function overloading, that are a part of programminglanguages such as Java or C++.

As described above, a parameterized workbook 105 a (FIG. 1) creates aset of outputs 120 (FIG. 1) based on a set of input parameters 110 (FIG.1). In other words, the parameterized workbook 105 a is a functionmapping a set of inputs 110 to a set of outputs 120. The outputs 120 maybe visual, such as a chart or a report rendered in some format (e.g.GIF, HTML) intended for display or printing. The outputs 120 may also bedata values represented in some format (e.g. XML) intended for furtherprocessing.

A virtual (polymorphic) workbook 205 maps a set of inputs 110′ to a setof outputs 120′ by selecting a concrete parameterized workbook (e.g.,105 a′ . . . 105 n′, generally 105′) from among a set 220 of compatibleworkbooks and then instantiating the concrete workbook 105′. Theselection module 215 selects a concrete workbook 105′ based on the valueof the parameters 110′ the virtual workbook 205 receives and a set ofrules 215 associated with the virtual workbook 205. As an example, avirtual workbook 205 can represent monthly commission payments to salesemployees. Different kinds of employees might have different sales plansdepending on their job title, seniority, territory, and the like, andthe parameterized workbook 105 for different employees might bedifferent for different months, for example, because they were promoted.The selection module 215 selects a concrete parameterized workbook 105′based on received values of parameters 110′ corresponding to job title,seniority, territory, month, and the like. The virtual workbook 205provides a uniform mechanism of determining any employee's commissionpayment for any particular month, while the underlying set of concreteparameterized workbooks 220 allow for the differences in the actualcalculations for the different employees or different months.

In one embodiment, virtual workbooks 205 (as well as concrete workbooks105) are associated with uniform resource locators (URLs), which arepart of the naming scheme used in the World Wide Web. The particularnaming scheme used to refer to workbooks 105, 205 does not matter,however, since the virtual workbooks 205 themselves provide themechanism for mapping a generic request to a specific workbook 105′ thatimplements that request for a given set of parameters 110′.

A model can employ a virtual workbook 205 in any context where aparameterized workbook 105 can be used. For example, a virtual workbook205 may be referenced by formulas within other workbooks (this referencemay be based on the URL associated with the workbook or it may be basedon a different naming scheme not based on URLs). As a result, aparameterized workbook 105′ may have a dependency on a virtual workbook205 that is in turn implemented by one or more concrete workbooks 105′,perhaps including the referencing workbook 105′ itself. This mechanismallows the formulas in the referencing workbook 105′ to refertransparently to one of several concrete workbooks 105′ that the virtualworkbook 205 selects based on the value of the parameters 110′ thevirtual workbook 205 receives. A user or the system can add and/orchange over time the set of rules 215 for selecting a concrete workbook105′ and the set 220 of concrete workbooks. These changes aretransparent to a referencing (calling) workbook and so the user orsystem does not need to change the formulas within the referencingworkbook. This transparency makes a virtual workbook 205 a powerfulbuilding block to allow end users to create and manage large, flexiblespreadsheet-based computations.

Similar to the parameterized workbook 105, in one embodiment, the systemreceives an XML file to create a virtual workbook 205. For example, theXML file can be defined as follows:

<NVQuantity name=“CommissionEarnedYTD”  type=“ByDecisionTree”> <paramdef type=“Person” name=“person”/>  <paramdef type=“Year”name=“period”/>  <paramdef type=“MonthIndex” name=“monthindex”/> <NVDecisionTree> <NVRule>  <NVRulet parameter_name=“person”><NVCondition type=“attr” attr=“Position”  value=“Sales Rep”/><NVCondition type=“attr” attr=“Position”  value=“District Manager”/><NVCondition type=“attr” attr=“Position”  value=“Regional Manager”/> </NVRulet>  <NVQuantityImpl type=“ByCellName” valueType=“Real”> <NVCellname=“Commission_YTD” workbook=“Rep_ProductLineCommissionCalculations”/> </NVQuantityImpl> </NVRule> <NVRule>  <NVRulet parameter_name=“person”><NVCondition type=“attr” attr=“Position”  value=“Strategic AccountManager”/>  </NVRulet>  <NVQuantityImpl type=“ByCellName”valueType=“Real”> <NVCell name=“Commission_YTD”workbook=“SAM_ProductLineCommissionCalculations”/>  </NVQuantityImpl></NVRule>  </NVDecisionTree> </NVQuantity>

The file in this example names this virtual workbook 205“CommissionEarnedYTD” and defines the type of the virtual workbook 205as “ByDecisionTree.” As defined in this example, the selection module215 uses the two rules of the decision tree to select the appropriateconcrete parameterized workbook 105′. Both rules examine the parameter110′ named “person” and the value of its attribute “position.” Accordingto the first rule, if the value of the position attribute is sales rep,district manager or regional manager, the selection module 215 selectsthe parameterized workbook 105′ Rep_ProductLineCommissionCalculations.The associated output 120′ returned after this selection is the valuefrom the cell in the selected workbook named Commission_YTD. Accordingto the second rule, if the value of the position attribute is strategicaccount manager, the selection module 215 selects the parameterizedworkbook 105′ SAM_ProductLineCommissionCalculations. The associatedoutput 120′ returned after this selection is the value from the cell inthe selected workbook named Commission_YTD.

Although the exemplary embodiment above uses two rules and a decisiontree, it is understood that the system can employ other selection rulesand mechanisms to select a concrete workbook 105′ from the set 220.

In broad overview, FIG. 3 illustrates an embodiment of a system 300 togenerate and use parameterized 105 and virtual 205 workbooks inaccordance with the invention. The system 300 includes an applicationserver node 305, a calculation module 310, a storage module 315 and aclient node 320. The system 300 can optionally include models 325 andmiscellaneous Web services 330.

The application server node 305, the calculation module 310 and thestorage module 315 communicate with each other via a network 340 a usingcommunication channels 345 a, 345 b, and 345 c, respectively. The clientnode 320 communicates with the application server node 305 via a network340 b using communication channels 335 d and 335 e, respectively. Thenetworks 340 a and 340 b can also be part of the same network 340 asshown with optional connection 340 c. If optional connection 340 c isincluded, then one of the application server node's 305 communicationchannels 345 a or 345 e can be eliminated. If the models 325 and themiscellaneous Web services 330 are included in the system 300, theycommunicate with the network 340 b using communication channels 335 fand 335 g, respectively. The networks 340 a, 340 b and 340 c aregenerally referred to as 340. The communication channels 345 a, 345 b,345 c, 345 d, 345 e, 345 f and 345 g are generally referred to as 345.

For example, the networks 340 and the communication channels 345 caninclude and/or be part of an internal bus, a local-area network (LAN),such as a company Intranet, a wide area network (WAN) such as theInternet or the World Wide Web and/or the like. The networks 340 and thecommunication channels 345 represent, for example, standard telephonelines, LAN or WAN links (e.g., T1, T3, 56 kb, X.25), broadbandconnections (ISDN, Frame Relay, ATM), wireless connections (cellular,WLAN, WWAN, 802.11) and/or the like. The connections can be establishedand data can be exchanged using a variety of communication protocols andlanguages (e.g., HTTP(S), TCP/TP, SSL, PPTP, HTML, XML, SOAP, IPX, SPX,NetBIOS, Ethernet, RS232, direct asynchronous connections, VPNprotocols, a proprietary protocol, a proprietary language and/or thelike). In one embodiment, the servers 305, 310 and 315 and the client320 encrypt all communication when communicating with each other.

Each of the servers and modules 305, 310 and 315 can be any computingdevice capable of providing the services requested by the other serversand modules or by the client node 320. Particularly, this includesgenerating and processing parameterized workbooks as described herein.For clarity, FIG. 3 depicts server node 305 as an entity separate anddistinct from modules 310 and 315, and each node 305, 310 and 315 canindependently be in communication with the network 340 a, representingthat the nodes 305, 310 and 315 are logically independent. It is to beunderstood, however, that the nodes 305, 310 and 315 can also beimplemented individually or in any combination, for example, on a singleserver, distributed on portions of several (i.e., more than two)servers, and/or as part of a single server node or server farm incommunication with the network 340 b through, for example, a single Webserver (not shown).

The client node 315 can be any computing device (e.g., a personalcomputer, set top box, wireless mobile phone, handheld device, personaldigital assistant, kiosk, etc) used to provide a user interface toaccess the application server 305. The client node 320 can include abrowser module 350.

A user can use the system 300, inter alia, to generate and processparameterized 105 and virtual 205 workbooks. Two objectives of thepresent invention are to simplify the creation and maintenance of largespreadsheet-based computation models and to allow those models to becomputed without manual intervention. The system 300 facilitates theseobjectives by storing tracking information about workbooks, includingtheir parameters 110, parameter types and dependencies in the storagemodule 315. In one embodiment, the system 300 stores this information asa workbook information database. The workbook information databasetracks and controls the names and storage locations of all parameterized105 and virtual 205 workbooks used in the system 300. The databaserecords information about the parameters 110 and parameter types andoutputs 120 and output types of each workbook (e.g., 105, 205) and thedependencies between workbooks that exist when one workbook refers toanother workbook using a formula. Each workbook instantiation may dependon one or more instantiations of the same workbook or of otherworkbooks. The dependency information allows the system 300 to performworkbook computations without manual intervention, for example in anapplication server 305 environment, because the system 300 can determineall dependent workbooks, open them, instantiate them and resolve alllinkages between the workbooks appropriately. Table 1 represents anembodiment of parameter and where-used data stored in the workbookinformation database.

TABLE 1 Workbook: Name: DM_Sales_Rank Location:/workbooks/DM_Top_Performers.xls Associated Parameters: Name: Type:Restriction: Person Class Person none Period Class Period type = YearMonthIndex Enumeration[1 . . . 12] none Associated Outputs: Name: Type:Cell: TopRank Number MaximumRank Dependencies: Dependency Type:Description: Workbook Output CommissionCalculations, outputYTDCommission Data Query PeopleByManager

As shown in Table 1, the database stores the name, the type and anyrestrictions of the parameters 110 associated with the representedworkbook. The names of parameters associated with the representedworkbook are person, period and monthindex. Each parameter has a definedtype. Class person and class period are class types that define theattributes associated with the respective parameter. Enumeration is anindex type that associates an index with a corresponding calendar month.The restrictions data notes any restrictions on the parametersassociated with this workbook and in this example, there is arestriction on the value of the parameter period to be stated as a year.

In the Table 1 example, the database also stores the name, the type andreference information on all defined outputs of the representedworkbook. The names of the defined output shown is TopRank, which ismapped to a cell named MaximumRank within the workbook.

In the Table 1 example, the database also stored data on dependencies ofthe workbook on other workbooks or external data sources. There are twodependencies shown. One dependency shows that the represented workbookcalculation depends on the output of a calculation from anotherworkbook, the YTDCommissions output of the CommissionCalculationsworkbook. The second dependency indicates that the represented workbookuses a data range that depends on a named data query PeopleByManager.

In one embodiment, the workbook information database also tracksmultiple versions of workbooks (e.g., 105, 205) as they are created andmodified over time. This version information provides a historicalrecord of changes, allows changes to be rolled back, and can provide forstability of prior computations (changes to a workbook do notnecessarily have to affect existing computations, which can continue touse the previous version of the workbook at the user's discretion). Thesystem 300 can also use version tracking of workbooks to limit thevisibility of certain workbook versions to particular people. Forexample, a new workbook version might be visible only to a set of“approvers” as part of a formal approval process. Once the new versionhas been approved, it is then made visible globally and affects thecomputations seen by everyone. Table 2 represents an embodiment ofversion data stored in the workbook information database.

TABLE 2 Versions: Branch: Version: State: Effective: Labels: Main 1Published Anytime None Main 2 Unpublished Aug. 01, 2002 None

The Table 2 , example stores the version data for the representedworkbook, including the branch, the version number the state of theversion, the effective date of the version-and any associated labels ofthe version. The branch values represent named branches within theversion graph of the workbook. The version number is an ID for eachversion relative to its branch. Thus, a particular workbook version isuniquely identified by a combination of its branch name and versionnumber. The state values represent whether the cached data associatedwith that workbook version has been published (e.g., available to usersto view and/or edit). The effective value represents the time frame inwhich the version is valid. The labels values represent user-definednames that can be assigned to individual workbook versions.

The workbook information database also tracks information about the datasources used by each workbook (e.g., 105, 205). This information is usedto detect when workbook results 120 are out of date and to efficientlyrecalculate the workbooks. For every reference in a workbook to externaldata, the system 300 records information in the workbook informationdatabase that represents a predicate describing the set of external dataupon which the workbook computation depends. This predicate may bebased, in part, on the values of workbook parameters 110. In otherwords, the results of different instantiations of the workbook maydepend upon different sets of external data.

The system 300 may monitor each data source to detect changes to thedata source, using the information in the workbook information databaseto determine which workbook instantiations arc affected by any change.Several different means may be used to monitor the data source.

Data sources can include both internal and external data sources. Aninternal data source is a database (or other data storage mechanism)that is accessed only through the system 300. All changes to data in aninternal data source are made through interfaces provided by the system300 and these update interfaces can perform the necessary monitoring ofthe data source. An external data source is one that is not under theexclusive control of the system 300 and can be updated through otherinterfaces. For external data sources, the system 300 can use knowntechniques such as database triggers, database polling and/or datatimestamps to provide similar levels of monitoring functionality. In thecase of external data sources, the application server 105 is the activeagent responsible for monitoring the external data source. The system300 can use mechanisms used to monitor external data sources to monitorinternal data sources as well.

By analyzing the computations and data dependencies within each workbook(e.g., 105, 205), the system 300 can in many cases optimize the datasource monitoring required. For example, a particular workbook (e.g.,105, 205) may have a parameter 110 that is a time period (perhaps acalendar month). Analysis of the workbook computation may reveal that aninstantiation of the workbook only depends on data for the monthspecified by the parameter. The data source monitoring subsystem canthen optimize by remembering which months have had data updates and onlyrecalculating workbook instantiations that depend on those months. Ifdata updates to prior months are rare, this avoids many unnecessarycalculations when referring to historical data.

An alternative or additional method the system 300 can use tracks datasource dependencies and dependencies of one workbook instantiation onother workbook instantiations by observing and recording the datarequests made by the referencing workbook and the results of these datarequests. The system 300 compares this recorded data against newlycomputed data in the future to determine whether a workbookinstantiation needs to be recalculated.

A third method the system 300 can use allows the user to supplysurrogate tests for determining when a workbook instantiation needs tobe recalculated. For example, the user may specify that instantiationsof particular workbooks for prior periods should only be recalculatedwhen specifically requested by the user and should be treated as up todate in all other situations.

In the system 300, a server processing component (not shown) or anauxiliary agent processing component (not shown) can perform data sourcemonitoring. In one embodiment, these components are always active (orare started on demand by other components, such as a database server,which is always active). This allows the system 300 to monitor datasource changes without the need for human intervention, unlike existingspreadsheet applications.

The system 300 supports distributing the computational load ofevaluating a spreadsheet model (e.g., 100, FIG. 1) across multipleservers (i.e., calculation module 110 represents workbook calculationson multiple servers). By utilizing the workbook dependencies and datasource dependencies stored in the storage module 115, as well as otherworkbook database information, the system 300 can schedule the executionof different workbook instantiations on different server computers.

Exploiting this parallelism improves the speed with which large modelscan be calculated, either for on-demand calculations or for automaticpre-calculations. Without the ability to perform true distributedprocessing, computations involving large networks of interconnectedspreadsheets can be impractical. Complex models using parameterized 105and virtual 205 workbooks typically exhibit a large amount of potentialparallelism.

Distributed processing of workbook computations depends on the abilityof the system 300 to separate the calculation of a referenced workbookfrom the use of a result from the referenced workbook by the referencingworkbook. In one embodiment, the system 300 accomplishes this byutilizing special formula primitives to allow one workbook to refer to aresult of another workbook, although other mechanisms are possible.

To perform the distributed workbook calculation component, the system300 can utilize dependency information recorded in the database forefficiency, but does not need to. The system 300 can also scheduledistributed executions of workbook computations dynamically as thecomputation proceeds. For example, if one workbook (e.g., 105, 205)references outputs 120 from a number of subsidiary workbooks (e.g., 105b . . . 105 n), the computation of the subsidiary workbooks can proceedin parallel as long as the parameter 110 values for one instantiation donot depend on the output 120 of a different subsidiary workbookinstantiation.

In addition to distributed computing, the system also utilizes cachingalgorithms and a database of cached computation results to optimize theperformance of evaluating large computational models, especially modelscreated as a network of interrelated workbooks. The system 300 storesresults 120 and/or other intermediate calculations in the storage module115 so that these values are not only available after the workbook filehas been loaded into the memory of the spreadsheet program and opened.By caching the results in, for example a database, the system 300 canprovide faster access and improve the performance of multi-workbookcomputations. The system 300 may cache values for many differentinstantiations of the same workbook with different parameter values. Thesystem 300 can also distinguish result values of workbooks that are thevalues of intermediate computations from subsidiary workbooks performedin the process of instantiating some other workbook and may also cachethese result values in the database. These result values or aggregationsof these result values, which may come from many different workbooks andworkbook instantiations, can then be queried and manipulated usingstandard database tools. These result values, whether cached or not, canalso be accessed through application programming interfaces provided bythe system 300, including network-based interfaces such as web services.

The system can perform workbook calculations either on demand (e.g. whena particular result is requested by an end user through a web browser orsome other user interface) or automatically by pre-calculating andcaching the results of the pre-calculations, using for example a processon the server 305, before they are requested. Automatic pre-calculationimproves the interactive performance of the system 300 when userssubsequently request these computations. The system 300 can performthese automatic pre-calculations at times when the servers 305, 310, 315would otherwise be idle.

The system 300 uses information about workbook parameter types andworkbook dependencies to determine the set of computations topre-calculate. Parameter type information determines the set ofallowable values for each parameter, and thus the set of all possibleinstantiations that may be requested for a particular workbook. The setof allowable values for a parameter may depend not only on the parametertype, but also on the state of some data source. So, for example, if anew user is added, then some workbook parameters may now have additionallegal values. The data source monitoring of the system 300 detects thischange to the data source and may pre-calculate and cache new resultsfor the newly possible instantiations of these workbooks.

The system can perform automatic pre-calculation and caching forintermediate results (e.g. the output of one workbook instantiation,which is used by another workbook instantiation), for results that maybe displayed for the user, and/or for a set of aggregated result valuesthat are cached so that they can be efficiently queried. Aggregatedresult values might include time series or multi-dimensional data, suchas daily gross profit broken down by region and product line.

The system 300 can use several different factors to decide which valuesshould be precalculated and cached. One factor, for example, isinformation available within the system 300 about dependencies betweenworkbooks and the size of individual computations. As described above,this information can be stored in the storage module 315, in for examplea database. The system 300 uses this information to estimate thepotential cost savings for pre-calculating particular computations.Another factor, for example, is historical access patterns and frequencyof data accesses (e.g., including web page hits), which providespredictive information about the demand for particular data values orworkbook instantiations in the future. A third factor, for example, isuser specified instructions about which workbook instantiations topre-calculate and which data values (including aggregated result values)to cache.

FIG. 4 illustrates an embodiment of a screen shot 400 of a specificexample of parameterized workbook 105 (FIG. 1) that the system 300 (FIG.3) can generate and process in accordance with the invention. Thescreenshot 400 includes a first portion 405 indicating the name of theworkbook containing the associated spreadsheet formulas and a secondportion 415 indicating an exemplary formula used for producingdescriptive text based on the workbook parameters 110 and a thirdportion 420 showing the resulting descriptive text displayed in a cell.The screenshot 400 also includes a fourth portion 425 illustratingtabular data calculated using a data range. The table 425 includes aprototype range 430 and five columns 435 a, 435 b, 435 c, 435 d and 435f. The screenshot 400 also includes a fifth portion 450 illustratingresults 120 in a graphical format.

In this example, the name of the parameterized workbook 105 representedby this screenshot 400 is DM_Sales_Rank, representing a ranking of allpeople reporting to a particular district manager who is identified by aperson parameter of the workbook. To generate this parameterizedworkbook 105, the system 300 instantiates workbook module 115 namedDM_Top_Performers.xls, as indicated by the first portion 405.

The equation in the second portion 415 uses the parameters 110 “person”and “period” as part of the formula. When the system 300 instantiatesthe workbook, the system 300 binds the value of the parameters personand period sent with the call to generate that instance. The result ofthe formula is a descriptive text string based on the workbook parametervalues, shown in a cell in the third portion 420.

In this example, the system 300 generates the table 425 using a datarange. A data range, as used herein, is a spreadsheet extension thatassociates a data query with a variable sized set of workbook cells. Thefeatures of the data range of the system 300 provide facilities forperforming automated, data-driven spreadsheet calculations based onvariable input data.

A data range of the system 300 associates a parameterized data querywith a range of workbook cells. Each data query parameter is identifiedby name and/or position. Each query parameter has a type that constrainsthe allowable values for the parameter. Queries are expressed in atextual syntax that a user and/or administrator creates. A data rangecan also refer by name to a pre-defined query. A data range defines aset of query parameter value expressions using the spreadsheet formulalanguage. There is one expression for each required query parameter inthe query. The system 300 uses the results of these expressions as thevalues of the query parameters when the system 300 evaluates the datarange query.

In general overview, a data range can include a prototype range, areplication direction, an output range, an output range name, a sortingspecification and a row insertion specification. The prototype rangespecifies a set of “template” cells that the system 300 replicates oncefor each set of results (or row) the query returns. The prototype rangeis usually one cell high (or alternatively one cell wide), but it canalso be rectangular. The replication direction indicates whether thecells from the prototype range are replicated downwards or to the right.The output range is the set of cells that are populated after the system300 evaluates the data range by replicating the formulas from theprototype range once for each set of results (or row) from the datarange query. The output range is of variable size, depending on thequery results, which in turn depend on the values of the queryparameters, which in turn usually depend, directly or indirectly, on theparameters to the workbook containing the data range. The creator of adata range does not provide formulas for cells in the output range, theformulas for these cells and the resulting cell values are computed bythe process of replicating the formulas from the prototype range. Whenthe prototype range is replicated, formatting information is replicatedas well as cell formulas. Therefore, the prototype range controls thevisual presentation of the data range as well as the calculated values.

The system 300 binds the output range name to the output range of thedata range after each time the data range is modified or re-evaluated.This mechanism facilitates writing formulas that extract selected datafrom the data range without depending on the size of the data range,because formulas can refer to the range through this name rather thanreferencing a fixed set of cells that can change over time. The outputrange name is optional. The sorting specification lists one or morecolumns (or rows) of the data range and for each column whether thecolumn should be sorted in ascending or descending order. The sortingtakes place after the system 300 evaluates the formulas for each cell inthe output range and is based on the result shown in each sorted cell ofthe output range. This post-evaluation sorting is independent of, and inaddition to, any pre-evaluation sorting performed by the queryassociated with the data range. The sorting specification is optional.The row insertion specification controls the behavior when the outputrange grows and shrinks. The row insertion specification determineswhether cells are either (a) inserted and deleted as the range grows andshrinks or (b) overwritten and cleared as the range grows and shrinks.In addition, in case (a), the row insertion specification indicateswhether the entire set of rows or columns outside of the data range isreplicated when the data range is evaluated or just the cells within thedata range itself.

In other embodiments, the data range can include other aspects. Theseother aspects can include options for handling special cases, such aswhen the query returns zero rows or one row. For example, when a queryreturns zero rows, the prototype range can be hidden (so that itoccupies no space in rendered output) or the prototype range can displayas blank fields independent of the values produced by the formulas inthe prototype range. Other aspects can include options for formattingthe data range as a whole, such as borders to apply to the entire datarange or striping effects created by alternating the background colorson each row of the output range as shown in the table 425 in FIG. 4.Other aspects can also include options for special rendering of the datarange in particular output formats. For example, the data range mightrender in HTML in a manner that allows the user to sort or pivot thedata range or otherwise interact with and/or change the data range orhow it is displayed in a web browser 350.

In general overview, to evaluate a data range, the system 300 evaluateseach parameter value expression for the data range query. The system 300then evaluates the query itself. The query expression may depend on theparameter value expressions, which may in turn depend on other cells inthe workbook and (directly or indirectly) on the workbook parametervalues. The query evaluation returns a set of rows, where each rowcontains multiple values. Each value in a row is identified by a name(the same names are used for the corresponding values in each row). Thevalues may have a variety of data types, including, for example,numbers, strings, dates and references to complex objects.

For each row the query returns, the system 300 replicates the prototyperange, either downward or to the right, depending on the replicationdirection. To replicate a cell, the system 300 evaluates the formulafrom the prototype cell in the context of the new cell, with the samebehavior as if the user had copied the cell. In addition, formulaswithin a data range can refer to the named values of the current row byusing the names as defined workbook names for the duration of theevaluation of the formula. When replicating a cell in a data range, thesystem 300 also replicates the formatting associated with the cell.

Referring back to the FIG. 4 example, the data range for the table 425is associated with a named query, for example, a PeopleByManager query,which is a predefined query taking as parameters a person and a timeperiod. The parameter expressions for this data range (not shown) simplypass the corresponding workbook parameters as the parameters values forthe query. This query returns a set of rows with two values in each row,an “EMPLOYEE” value and a “POSITION” value. The “EMPLOYEE” value is aunique identifier for a complex person object reporting to this manager,which is Abraham Lincoln as indicated in portion 420. The “POSITION”value is a string indicating the person's position within theorganization (e.g. “Sales Rep”). The prototype range 430 consists ofcells A19 to F19, which are formatted to be green in this example. CellA19 contains the formula =PersonName(EMPLOYEE). This formula uses asystem accessor function (PersonName) to return the full name of theemployee object in each row. The resulting names fill column 435 awithin the data range. Cell B19 contains the formula=NVQuantity(“Quota”, EMPLOYEE, Year(period)). This formula uses a systemspreadsheet extension (NVQuantity) to return a calculated value (i.e.,the current employee's quota) for the year containing the periodparameter to the workbook itself. In this case, the system 300 retrievesthe Quota quantity by retrieving a value from a database. It isnoteworthy that this formula refers to both a workbook parameter (i.e.,period) and a named value in the row (i.e., EMPLOYEE). The system 300displays the corresponding value for each selected employee within thedata range in column 435 b. Cell E19 contains the formula=NVQuantity(“CommissionEarnedYTD”, EMPLOYEE, period). This formulareferences a value calculated, in this case, as the output of anotherparameterized workbook 105. The system 300 displays the computed valuefor each selected person within the data range in column 425 e. Theformula in cell F19 contains the expression =E19/C19. This expressioncalculates the ratio between the person's year to date commission andtheir total sales. This formula uses relative references to perform thiscalculation. As the system replicates this formula in the output range,the system preserves the relative references so that the system 300calculates the ratio correctly in each row.

It is understood that a creator of a workbook can reference the systemformula language extensions (e.g., PersonName and NVQuantity) in anycell in a parameterized 105 or virtual 205 workbook, not only in cellswithin a data range. The process of evaluating a data range or theprocess of evaluating a formula containing a system formula languageextension may cause the system 300 to perform further processing. Forexample, a reference to a quantity such as CommissionEarnedYTD, which isimplemented as the output 120 of another workbook (e.g., 105, 205), maycause the system 300 to instantiate other workbooks (recursively) as aconsequence of instantiating the original workbook.

When an output 120 from another workbook (e.g., 105, 205) is needed, thesystem 300 instantiates the workbook unless a cached value for therequired output 120 is available and up to date. The system 300 searchesthe storage module 315 to determine if such a cached value is available.If so, then the system determines whether any dependencies of the cachedvalue have changed since the value was cached. Dependencies of thecached value include the workbook that calculated the value, any queriesor external references, such as NVQuantity expressions made during theinstantiation of the workbook that produced the cached value, and anyoutputs from other workbooks that were consumed by the workbook thatcalculated the value. The system 300 can use various methods tocircumvent full dependency checks, including, for example, userspecified rules defining how often to update cached values.

If a given quantity has no cached value or the cached value is not up todate, then the system 300 instantiates the target workbook. After thetarget workbook is instantiated, the system 300 retrieves the designatedoutput 120 from the workbook instantiation and then caches the updatedresult if caching is enabled for the specified system quantity. Aworkbook output 120 may be specified by cell coordinates (e.g. Sheet“Summary”, cell B10) or more typically by referring to a named workbookquantity within the workbook (e.g. “YTDCommission”). The named quantitytypically refers to a predefined cell within the workbook that containsthe desired output value, but it may also refer to a range of cells orother kinds of system formula language expressions. Each workbook mayspecify several different outputs, and the system cache maintenancemodule may choose to cache multiple outputs from the workbookinstantiation even though only one output from the instantiation wasoriginally requested. The system 300 may need these other cached outputvalues for other system calculations.

In the example, the data range in the DM_Top_Performers workbookrequests year-to-date commission numbers for each employee reporting tothe specified manager (i.e., Abraham Lincoln). This results in avariable number of workbook instantiations. The system may find some ofthe requested outputs cached in the storage module 315 and may have torecompute others. The creator of the CommissionEarnedYTD quantity canimplement that quantity by a set of virtual workbook rules that selectsdifferent workbooks for different sets of input parameters. In theexample, the system 300 may calculate commissions for sales reps by oneworkbook while calculating commissions for district managers by adifferent workbook. For each workbook that needs to be instantiated, theapplication server 305 schedules the calculation of the workbook. Thecalculation of workbooks may be spread over multiple calculation serverprocesses 310 running on multiple machines.

To further improve interactive performance, it is often useful toprecalculate sets of system quantities that are frequently referenced.For example, it can be useful to precalculate some quantities at nightafter daily sales transactions have been posted. The system 300 utilizesthe type system to guide precalculation. For example, theCommissionEarnedYTD quantity has a person parameter and a monthparameter. It is possible to run an automated precalculation of thisquantity for all people (or alternatively for Sales Reps in Europe) forthe current month.

For example, as described above, cell C19 of the table 425 contains theformula =NVQuantity(“SalesCredits”, EMPLOYEE, period). The system canprecalculate and cache the sales credits for each employee for eachyear. The entries for this quantity are shown in Table 3. As shown inTable 3, the system 300 stores the data using the ID attribute of aperson type parameter.

TABLE 3 Period ID 1998 1999 2000 2001 2002 nfasth 1000000 15000001700000 1200000 1200000 acabrera 1200000 1900000 2200000 1500000 1500000jolazabal 1900000 2900000 3400000 2400000 2400000 ascott 1200000 19000002200000 1500000 1500000 jhaas 2200000 3300000 3900000 2800000 2800000sflesch 1500000 2300000 2600000 1900000 1900000 checkman 500000 800000900000 700000 700000 bmay 1800000 2600000 3100000 2200000 2200000ksutherland 1100000 1700000 2000000 1400000 1400000 sstricker 15000002300000 2600000 1900000 1900000 fcouples 1900000 2800000 3300000 23000002300000 sames 1900000 2800000 3300000 2300000 2300000 xcarter 17000002500000 2900000 2100000 2100000

FIG. 5 illustrates an embodiment of a screen shot 500 of a specificexample of virtual workbook 205 (FIG. 2) that the system 300 (FIG. 3)can generate and process in accordance with the invention. Using thenetwork browser module 350 (FIG. 3) (e.g., a web browser) on a clientdevice 320 (FIG. 3), a manager (e.g., Harvey Mackay 505) can log in tothe application server 305 (FIG. 3) and request a summary report 510,including a table portion 512. The manager typically specifies theperiod of interest 515, or, in one embodiment, the period defaults tothe current year. A virtual workbook 205 implements the report 510. Avirtual workbook 205 selects a particular concrete workbook 105′ basedon rules associated with the definition of the virtual workbook. Therules are based on the parameters of interest, in this case, the manager505 and the period 515 for which the report is requested. The rules canalso utilize globally available information, such as the identity of theuser who is logged in, the current date, and the like. The applicationserver 305 processes the rules defining the virtual workbook and selectsa particular concrete workbook, for example, RegMgr_MonthlySummary. Theconcrete workbook is then instantiated by the calculation server 310 andthe print area, or an alternatively specified output range, of thespecified worksheet within the workbook is rendered as, for example,HTML and returned to the network browser 350. The system 300 can alsorender this returned data in many different formats besides the standardHTML. To allow the user to organize this returned data by userpreference, the system 300 can make this data interactive.

For example, as illustrated in FIG. 5 the summary report 510 includes atable portion 512 representing a data range, similar to the data rangeof table 425 (FIG. 4) described above. The system 300 renders a datarange in different output formats (e.g., HTML) in such a way as toprovide partial interactivity. Table portion 512 illustrates an exampleof an HTML rendering of a data range that is interactive. The creator ofa spreadsheet containing a data range (e.g., table 425) is notnecessarily the same person who might view an instantiation of thespreadsheet (e.g., table portion 512) through a network browser 350.Using system 300, the creator of the spreadsheet can define the datarange 512 to have certain interactive features when a user views thatdata range 512. These interactive features allow some operations on thedata, but not necessarily all of the operations available to the creatorof the data range, since the creator is working through a differentinterface, for example the interface shown in FIG. 4.

For example, the system 300 can allow sorting of columns of the datarange 512 when rendered in a network browser 350, for example byclicking with a mouse, using keystrokes and/or other user input. Thisallows the viewer of the information to sort the table in ascending ordescending order based on the alpha/numeric entries of a selectedcolumn. As another example, the rows of a data range might be grouped ina hierarchy, for example, a geographical hierarchy of continents,countries and states or provinces. The creator can define the data rangeso that the system 300 allows a user to interactively expand andcollapse the hierarchy using, for example, mouse clicks, keystrokesand/or other user input.

To instantiate the RegMgr_MonthlySummary workbook, the calculationserver 310 binds the workbook parameters to their corresponding values(e.g. “person” to “Harvey Mackay” and period to “2002”). Then thecalculation server 310 evaluates all formulas in the workbook to ensurethey are up to date. In one embodiment, the system 300 implements theworkbook parameters 110′ as named quantities within the workbook andworkbook formulas can refer to these names directly. As illustrated, auser can also change parameter values using the pull-down menus 525 aand 525 b. In response to changing the parameter values and clicking onthe “GO” button 530, the system 300 will re-evaluate the virtualworkbook 205, selecting a concrete workbook 105′, which may be the sameas the prior concrete workbook (RegMgr_MonthlySummary) or may bedifferent. The concrete workbook is then instantiated and rendered asdescribed above using the changed parameter values.

It is also possible to access system quantities through interfaces otherthan by requesting reports through a web browser 350. For example, thesystem 300 allows exposure to quantities via web services 330. One webservice 330 can be defined, for example, to return a matrix of monthlycommission payments for a particular year for a certain set of people,based on the person's role and the territory structure of the company.An invocation of this web service 330 might return all commissionpayments for 2002 for Territory Managers in North America.

A number of embodiments of the invention have been described.Nevertheless, it will be understood that various modifications may bemade without departing from the spirit and scope of the invention. Forexample, processing can be distributed in many different configurations,additional parameters can be defined and additional types ofcomputations can be modeled. Accordingly, other embodiments are withinthe scope of the following claims.

1. A method for utilizing a plurality of spreadsheets, the methodcomprising: defining a parameter external to the spreadsheets;associating the parameter with the plurality of spreadsheets at designtime to define a virtual workbook; defining a set of one or moreselection rules at design time to select a target spreadsheet from theplurality of spreadsheets based at least in part on the value of theparameter; receiving a value for the parameter at run time; choosing atarget spreadsheet based on the selection rules and the value of theparameter; computing the cell values in the target spreadsheet; andrendering an output based on the computed cell values.
 2. The method ofclaim 1, further comprising: associating the external parameter with aspreadsheet in the plurality of spreadsheets, wherein the parameterapplies to the spreadsheet as a whole, thereby allowing any formula inthe spreadsheet to reference the parameter; and when the spreadsheet isthe target spreadsheet, computing the cell values in the targetspreadsheet that are dependent, directly or indirectly, on theparameter.
 3. The method of claim 1, further comprising: defining anamed result external to the plurality of spreadsheets, wherein thenamed result references one or more cells within the target spreadsheet.4. The method of claim 3, further comprising: generating the namedresult based at least in part on the received parameter value and thetarget spreadsheet.
 5. The method of claim 4, further comprising:defining a format for the named result.
 6. The method of claim 1,further comprising: associating a type with the parameter at designtime, wherein the type defines a range of values that are allowable tobe received for the parameter at run time.
 7. The method of claim 1,further comprising: associating a type with the parameter at designtime, wherein the type defines attributes associated with the parameter,wherein the attributes are allowable to be used by the selection rules.8. The method of claim 2, further comprising: associating a type withthe parameter at design time, wherein the type defines attributesassociated with the parameter, wherein the attributes are allowable tobe used by formulas within the target spreadsheet.
 9. The method ofclaim 2, further comprising: storing, separate from the virtual workbookand the target spreadsheet and after the target spreadsheet closes, anintermediate value used in a calculation for the target spreadsheet; andassociating the intermediate value with the virtual workbook and a valueof the parameter used to calculate the intermediate value.
 10. Themethod of claim 9, further comprising: calculating a named result of thevirtual workbook based at least in part by using the stored intermediatevalue instead of calculating the formulas for some of the cells in thetarget spreadsheet.
 11. The method of claim 10, further comprising:automatically calculating a named result associated with the virtualworkbook without opening the target spreadsheet by using the storedintermediate value.
 12. The method of claim 11, wherein the storedintermediate value is the value of the named result.
 13. The method ofclaim 1, further comprising: distributing the calculations of the targetspreadsheet among a plurality of computing devices.
 14. The method ofclaim 13, wherein distributing the calculations of the targetspreadsheet comprises: distributing the calculations of the targetspreadsheet among a plurality of computing devices based at least inpart on the selection rules or on one or more formulas in the targetspreadsheet.
 15. The method of claim 14, wherein distributing thecalculations of the target spreadsheet comprises: distributing thecalculations of the target spreadsheet among a plurality of computingdevices based at least in part on one or more associations stored in adatabase associated with the virtual workbook.
 16. A method forutilizing a virtual workbook of claim 1, the method comprising: defininga name for a virtual workbook, wherein the name refers to the pluralityof spreadsheets and the selection rules associated with them; using aformula in a spreadsheet to refer to the virtual workbook by name;supplying a value for the parameter to the virtual workbook in theformula; and using a calculated output of the target spreadsheet of thevirtual workbook as the result of the formula.
 17. The method of claim16, further comprising: supplying the name of a named result of thevirtual workbook in the formula and using the calculated value of thenamed result as the result of the formula.
 18. A system for utilizing aplurality of spreadsheets, comprising: a list of external parameters;and a selection module to select a target spreadsheet from a pluralityof spreadsheets based on a set of selection rules defined on theplurality of spreadsheets and the external parameter list.
 19. A systemof claim 18, further comprising: a calculation module to receive a valuefor an external parameter and a target spreadsheet and generate anoutput based at least in part on the target spreadsheet and the value ofthe parameter.
 20. A system of claim 19, further comprising: aninstantiator module to calculate a named result value based on anexternal parameter value, the selection rules, the plurality ofspreadsheets and one or more cells designated as the named result.
 21. Asystem of claim 19, further comprising: a referencing spreadsheet havingone or more formulas that refers to a virtual workbook by name andsupply a value for the virtual workbook parameters, wherein the value ofthe formula is an output calculated from the target spreadsheet of thevirtual workbook based on the parameter values.
 22. A system of claim19, further comprising: a storage module to store a location of each ofthe spreadsheets in the plurality of spreadsheets, the list of selectionrules, the name of each external parameter and the list of namedresults.